The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network
Author
Abstract
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by $A$ and the input dimension is $n$. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus $A^3 \sqrt{(\log n)/m}$ (ignoring $\log A$ and $\log m$ factors), where $m$ is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
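To make the scaling in the abstract concrete, the following minimal Python sketch computes the per-unit weight bound A for a small two-layer network and the term A^3 * sqrt(log(n) / m). The function names, the list-of-rows weight layout, and the example weights are illustrative assumptions, not taken from the paper. Bias terms are left out, and the constants and the log A and log m factors that the abstract says it ignores are omitted too, so the numbers indicate scaling only and are not an actual error bound.

```python
import math


def max_unit_l1_norm(layers):
    """Largest sum of weight magnitudes over all units (an estimate of A).

    `layers` is a list of layers; each layer is a list of rows, one row of
    incoming weights per unit. This layout is an assumption of the sketch.
    """
    return max(sum(abs(w) for w in row) for layer in layers for row in layer)


def parameter_count(layers):
    """Total number of weights in the network."""
    return sum(len(row) for layer in layers for row in layer)


def complexity_term(A, n, m):
    """A^3 * sqrt(log(n) / m), with constants and log A, log m factors omitted."""
    if n < 2 or m < 1:
        raise ValueError("need input dimension n >= 2 and sample size m >= 1")
    return A ** 3 * math.sqrt(math.log(n) / m)


# Hypothetical two-layer network: n = 4 inputs, 3 hidden units, 1 output unit.
hidden = [
    [0.5, -0.25, 0.25, 0.5],
    [1.0, -0.5, 0.0, 0.25],
    [0.25, 0.25, -0.25, 0.25],
]
output = [[0.5, -1.0, 0.5]]
network = [hidden, output]

A = max_unit_l1_norm(network)
n = len(hidden[0])

for m in (100, 400, 1600):
    term = complexity_term(A, n, m)
    print(f"weights={parameter_count(network)}  A={A}  m={m}  term={term:.4f}")
```

Adding more hidden units whose incoming weight magnitudes sum to at most A increases the parameter count but leaves this term unchanged, which is the point the abstract makes about weight size versus network size. The term halves each time m is quadrupled and grows eightfold when A doubles.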
Journal title:
- IEEE Trans. Information Theory
Volume 44, Issue
Pages -
Publication date 1998